
Primer on causal inference

  • Metadata
  • Predictive models on their own can't answer the questions businesses usually want answered -> will doing A cause B?
    • Prediction and inference are opposite goals
    • Correct inference often requires us to sacrifice predictive power
    • Maximum predictive power can lead to incorrect causal inference
  • Experiments
    • Simple A/B Test
      • From a frequentist POV, you want to do a t-test to check whether there is a "large" difference between the two populations
      • Calculate power as a function of duration (holding α and effect size constant)
      • Use an arbitrary α = 0.1 and β = 0.1 (i.e., power = 0.9) if you think a false negative is just as bad as a false positive
      • Perform a one-sided test, since we usually care more about the sign of the effect than its magnitude
      • The outcomes
        • B is significantly better than A. Ship B
        • B is significantly worse than A. Keep A
        • B is not statistically different from A??? Do we keep A or ship B?
      • But detecting the effect size usually requires a larger sample than is feasible
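The power calculation above can be sketched with `statsmodels` (the effect size, α, and power values below are illustrative, matching the rule of thumb in these notes):

```python
# Sketch: required sample size per arm for a one-sided two-sample t-test,
# holding alpha and effect size constant. Numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

n_per_arm = analysis.solve_power(
    effect_size=0.1,       # standardized effect size (Cohen's d); "small"
    alpha=0.1,             # significance level
    power=0.9,             # 1 - beta
    alternative="larger",  # one-sided: we mostly care about the sign
)
print(round(n_per_arm))    # sample size needed per group
```

Even with a lenient α = 0.1, a small effect size pushes the required sample into the thousands per arm, which is the feasibility problem the bullet above describes.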
    • Bayesian Approach
      • Treat the group assignment as a random effect
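The notes mention modeling group assignment as a random effect; as a simpler illustration of the Bayesian framing (a swapped-in Beta-Binomial model, not the random-effects model itself, and all counts are made up), you report P(B beats A) instead of a p-value:

```python
# Sketch: Beta-Binomial Bayesian A/B comparison (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical observed conversions / trials per arm.
conv_a, n_a = 120, 1000
conv_b, n_b = 140, 1000

# Beta(1, 1) prior; posterior is Beta(1 + successes, 1 + failures).
post_a = stats.beta(1 + conv_a, 1 + n_a - conv_a)
post_b = stats.beta(1 + conv_b, 1 + n_b - conv_b)

# Monte Carlo estimate of P(rate_B > rate_A).
draws_a = post_a.rvs(100_000, random_state=rng)
draws_b = post_b.rvs(100_000, random_state=rng)
p_b_better = (draws_b > draws_a).mean()
print(f"P(B > A) = {p_b_better:.3f}")
```

This sidesteps the awkward "not statistically different" outcome: the posterior probability is directly interpretable as a bet on shipping B.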
    • Factorial Design
    • Crossover Design
    • Blocking
  • Quasi-Experiments
    • Difference-in-differences
    • Interrupted time series
    • Synthetic controls
    • Google's CausalImpact package
      • Observe a time series X with some intervention
      • Build a [[counterfactual]]: what would the time series have been without the intervention
      • Look for ingredients to put into a blender
      • End result is a good counterfactual
      • The difference between observed and counterfactual is the [[terra-cotta]] ==causal effect estimate==
      • [[champagne]] ==Key Assumptions==
        • Changes in X do not affect the ingredients in the synthetic control
        • The relationship between X and the ingredients would have continued unchanged without the intervention
      • Most of the work is in finding the ingredients, and making sure the ingredients do not produce arbitrary estimates
      • [[champagne]]==Rule of thumb: the post-intervention period shouldn't be too long because forecasts break down the farther ahead we look. The pre-intervention period should be 3 to 4 times the length of the post-intervention period==
      • If there is a lot of pre-intervention data, split it into three periods: exploration (oldest data, for finding ingredients), validation (middle), and estimation (most recent)
      • Choose ingredients that are correlated with X
      • Choose the ingredients and X before the quasi-experiment is run
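The counterfactual idea above can be sketched with a plain pre-period regression on simulated data (the real CausalImpact package fits a Bayesian structural time-series model; everything below, including the lift of 10, is made up):

```python
# Sketch of the CausalImpact idea: fit outcome ~ ingredient on the
# pre-intervention period, predict the post period as the counterfactual,
# and take the difference as the causal effect estimate. Simulated data.
import numpy as np

rng = np.random.default_rng(42)

T, t0 = 120, 90                                  # series length, intervention time
control = 50 + np.cumsum(rng.normal(0, 1, T))    # "ingredient" series
y = 2.0 * control + rng.normal(0, 1, T)          # outcome tracks the ingredient
y[t0:] += 10                                     # simulated post-intervention lift

# Fit y ~ control on the pre-intervention period only.
X_pre = np.column_stack([np.ones(t0), control[:t0]])
coef, *_ = np.linalg.lstsq(X_pre, y[:t0], rcond=None)

# Counterfactual: what y would have been without the intervention.
X_post = np.column_stack([np.ones(T - t0), control[t0:]])
counterfactual = X_post @ coef

effect = (y[t0:] - counterfactual).mean()
print(f"estimated causal effect = {effect:.1f}")  # should roughly recover the simulated lift
```

Note how both key assumptions show up here: the intervention does not touch `control`, and the pre-period relationship (slope ≈ 2) is assumed to hold in the post period.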
  • Observational Data
    • When we cannot intervene due to real-life constraints and we can only observe
    • Causal DAG